339 research outputs found

    Statistical structures for internet-scale data management

    Efficient query processing in traditional database management systems relies on statistics on base data. For centralized systems, there is a rich body of research results on such statistics, from simple aggregates to more elaborate synopses such as sketches and histograms. For Internet-scale distributed systems, on the other hand, statistics management still poses major challenges. With the work in this paper we aim to endow peer-to-peer data management over structured overlays with the power associated with such statistical information, with emphasis on meeting the scalability challenge. To this end, we first contribute efficient, accurate, and decentralized algorithms that can compute key aggregates such as Count, CountDistinct, Sum, and Average. We show how to construct several types of histograms, such as simple Equi-Width, Average-Shifted Equi-Width, and Equi-Depth histograms. We present a full-fledged open-source implementation of these tools for distributed statistical synopses, and report on a comprehensive experimental evaluation of our contributions in terms of efficiency, accuracy, and scalability.
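    To make the kinds of synopses named above concrete, the following is a minimal, centralized sketch of the basic aggregates and of equi-width and equi-depth histogram construction. It is purely illustrative: the paper's contribution is computing such synopses in a decentralized way over a structured peer-to-peer overlay, which this sketch does not attempt.

```python
# Hypothetical, centralized sketch of the synopses named in the abstract
# (Count, Sum, Average, CountDistinct, equi-width and equi-depth histograms).
# The paper's algorithms compute these over a structured overlay; this only
# illustrates what the synopses summarize.

def basic_aggregates(values):
    count = len(values)                      # Count
    total = sum(values)                      # Sum
    avg = total / count if count else None   # Average
    distinct = len(set(values))              # CountDistinct (exact, not sketched)
    return {"count": count, "sum": total, "avg": avg, "count_distinct": distinct}

def equi_width_histogram(values, num_buckets):
    """Buckets of equal value range; bucket i counts values in [lo + i*w, lo + (i+1)*w)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets or 1     # avoid zero width when all values are equal
    counts = [0] * num_buckets
    for v in values:
        i = min(int((v - lo) / width), num_buckets - 1)
        counts[i] += 1
    return counts

def equi_depth_histogram(values, num_buckets):
    """Bucket boundaries chosen so each bucket holds roughly the same number of values."""
    ordered = sorted(values)
    step = len(ordered) / num_buckets
    return [ordered[min(int(round(i * step)), len(ordered) - 1)] for i in range(1, num_buckets)]

if __name__ == "__main__":
    data = [3, 7, 7, 12, 18, 21, 21, 21, 34, 40]
    print(basic_aggregates(data))
    print(equi_width_histogram(data, 4))   # counts per equal-width bucket
    print(equi_depth_histogram(data, 4))   # boundary values between equal-count buckets
```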

    HIGGINS: where knowledge acquisition meets the crowds

    We present HIGGINS, an engine for high quality Knowledge Acquisition (KA), placing special emphasis on its architecture. The distinguishing characteristic and novelty of HIGGINS lies in its special blending of two engines: an automated Information Extraction (IE) engine, aided by semantic resources, and a game-based Human Computing (HC) engine. We focus on KA from web data and text sources and, in particular, on deriving relationships between entities. As a running application we utilise movie narratives, from which we wish to derive relationships among movie characters.
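    As a rough illustration of the IE/HC blend described above (not the actual HIGGINS architecture), the following hypothetical sketch shows an automated extractor proposing candidate relationships between movie characters and routing low-confidence candidates to a game-based human-verification step; all names, scores, and thresholds are invented for illustration.

```python
# Hypothetical sketch of an IE + human-computing pipeline of the kind the
# HIGGINS abstract describes. Not the HIGGINS system itself: extractor output,
# confidence scores, and the routing threshold are all illustrative.

from dataclasses import dataclass

@dataclass
class CandidateRelation:
    subject: str       # e.g. a movie character
    relation: str      # e.g. "sibling_of"
    obj: str
    confidence: float  # score assigned by the automated IE engine

def automated_ie(text):
    """Stand-in for an IE engine aided by semantic resources; returns fixed examples."""
    return [
        CandidateRelation("Luke", "sibling_of", "Leia", 0.55),
        CandidateRelation("Vader", "father_of", "Luke", 0.92),
    ]

def route(candidates, threshold=0.8):
    """Accept high-confidence extractions; send the rest to game-based human verification."""
    accepted = [c for c in candidates if c.confidence >= threshold]
    to_crowd = [c for c in candidates if c.confidence < threshold]
    return accepted, to_crowd

if __name__ == "__main__":
    accepted, to_crowd = route(automated_ie("movie narrative text ..."))
    print("auto-accepted:", accepted)
    print("sent to game-based verification:", to_crowd)
```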

    The role of geomorphic variability on floodplain function: an analysis of sediment and phosphorus deposition

    Excess sediment and associated phosphorus contribute to poor water quality and induce harmful algal blooms in freshwater lakes, including Lake Champlain. Floodplains can slow flood waters and create a depositional environment for sediment and nutrients, reducing downstream fluxes. The capacity of floodplains to capture sediment and nutrients is poorly understood in the Lake Champlain Basin (LCB), limiting the efficacy of remediation work to reduce phosphorus loads. This project assisted with recent work that measured deposition on well-connected floodplains. This part of the project focused on characterization of flood-deposited sediments and evaluation of the controls on measured variability in sediment deposition during selected flood events in 2019, using a classification of geomorphic controls. These classifications represent differences in depositional processes. Floodplain sediment samples from 20 sites across Vermont were analyzed for mass, total phosphorus, and particle size. Floodplain sites were classified by specific stream power. Plots within each site were classified by local geomorphic features. These analyses were used to describe how the depositional setting relates to sediment, phosphorus, and particle size measured at the study sites. We found that medium energy floodplains (class B) had higher rates of sediment and total phosphorus (TP) deposition than low energy floodplains (class C). We also identified trends in sediment and TP deposition within sites, describing patterns associated with elevation profiles, distance from channel, and floodplain feature units. Results of this work will contribute to an improved understanding of how floodplains interact with river-transported sediment and associated nutrients during floods.
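    A minimal, hypothetical sketch of the kind of comparison described above: grouping plot-level measurements by floodplain energy class and summarizing sediment and total phosphorus (TP) deposition. Column names and values are invented for illustration and are not the study's data.

```python
# Hypothetical sketch of the class B vs. class C comparison in the abstract:
# group plot-level deposition measurements by floodplain energy class and
# summarize sediment and TP deposition. All data here are made up.

import pandas as pd

plots = pd.DataFrame({
    "site":           ["S1", "S1", "S2", "S2", "S3", "S3"],
    "energy_class":   ["B",  "B",  "C",  "C",  "B",  "C"],   # B = medium energy, C = low energy
    "sediment_kg_m2": [1.8,  2.4,  0.6,  0.9,  2.1,  0.7],   # deposited sediment mass per unit area
    "tp_mg_m2":       [950,  1200, 310,  420,  1100, 380],   # total phosphorus deposition
})

# Mean deposition per energy class, mirroring the B-vs-C comparison reported above.
summary = plots.groupby("energy_class")[["sediment_kg_m2", "tp_mg_m2"]].mean()
print(summary)
```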

    Equivariant Localization

    Debatable results of surgery for lung cancer in a patient with long existing pulmonary metastases from differentiated thyroid carcinoma

    Introduction: The appropriate treatment of a patient presenting with a new non-small cell lung cancer (NSCLC) and a history of chronic lung metastases of thyroid origin has never been reported. In such cases, the presence of long-standing thyroid metastatic disease with proven “limited malignant potential” could be considered a minor treatment problem, justifying the decision to focus on the primary lung carcinoma as the only serious threat to the patient’s life. Case report: We report the surgical treatment of a newly presented NSCLC in a patient with chronic lung metastases of thyroid origin and present all the diagnostic, staging, and treatment problems. Conclusion: The therapeutic results of our surgical approach were not encouraging. This could be attributed to the problems in staging the NSCLC and to the well-documented limited immunological response of such patients with multiple neoplasms.

    Streaming Weighted Sampling over Join Queries

    Join queries are a fundamental database tool, capturing a range of tasks that involve linking heterogeneous data sources. However, with massive table sizes, it is often impractical to keep these in memory, and we can only take one or a few streaming passes over them. Moreover, building out the full join result (e.g., linking heterogeneous data sources along quasi-identifiers) can lead to a combinatorial explosion of results due to many-to-many links. Random sampling is a natural tool to boil this oversized result down to a representative subset with well-understood statistical properties, but it turns out to be a challenging task due to the combinatorial nature of the sampling domain. Existing techniques in the literature focus solely on the setting with tabular data residing in main memory, and do not address aspects such as stream operation, weighted sampling, and more general join operators that are urgently needed in a modern data processing context. The main contribution of this work is to meet these needs with more lightweight, practical approaches. First, a bijection between the sampling problem and a graph problem is introduced to support weighted sampling and common join operators. Second, the sampling techniques are refined to minimise the number of streaming passes. Third, techniques are presented to deal with very large tables under limited memory. Finally, the proposed techniques are compared to existing approaches that rely on database indices, and the results indicate substantial memory savings, reduced runtimes for ad-hoc queries, and competitive amortised runtimes.
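    A minimal sketch of the problem setting (not the paper's technique): draw a weighted sample of join results without ever materializing the full join, by hashing one table on the join key, streaming the other, and feeding each produced join tuple to a weighted reservoir sampler (Efraimidis-Spirakis A-Res). The table layouts and weight function below are assumptions for illustration.

```python
# Hypothetical sketch: weighted sampling over a streamed join without storing
# the full join result. Assumes one table fits in memory for the hash index;
# the paper's techniques relax such assumptions and minimise streaming passes.

import heapq
import itertools
import random

def weighted_reservoir(stream, k):
    """Keep k items from (weight, payload) pairs; selection probability grows with weight."""
    heap = []                       # min-heap of (key, tiebreak, payload), key = u**(1/weight)
    tiebreak = itertools.count()    # avoids comparing payloads on (unlikely) key ties
    for weight, payload in stream:
        if weight <= 0:
            continue
        key = random.random() ** (1.0 / weight)
        entry = (key, next(tiebreak), payload)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)
    return [payload for _, _, payload in heap]

def stream_join(r_rows, s_rows, weight_fn):
    """Yield (weight, joined_row); r_rows is hashed on the join key, s_rows is streamed once."""
    index = {}
    for r in r_rows:                              # pass 1: build hash index
        index.setdefault(r["key"], []).append(r)
    for s in s_rows:                              # pass 2: stream the other table
        for r in index.get(s["key"], []):
            row = {**r, **s}
            yield weight_fn(row), row             # join tuples are consumed, never stored

if __name__ == "__main__":
    R = [{"key": k, "r_val": k * 10 + i} for k in range(5) for i in range(3)]
    S = [{"key": k % 5, "s_val": k} for k in range(100)]
    sample = weighted_reservoir(stream_join(R, S, lambda row: row["s_val"] + 1), k=8)
    print(sample)
```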

    EU-NICE, Eurasian University Network for International Cooperation in Earthquakes

    Despite the remarkable scientific advancements of earthquake engineering and seismology in many countries, seismic risk is still growing at a high rate in the world’s most vulnerable communities. Successful practices have shown that a community’s capacity to manage and reduce its seismic risk relies on capitalization on policies, technology, and research results. An important role is played by education, which contributes to strengthening the technical curricula of future practitioners and researchers through university and higher education programmes. In recent years an increasing number of initiatives have been launched in this field at the international and global cooperation level. Cooperative international academic research and training is key to reducing the gap between advanced and more vulnerable regions. EU-NICE is a European Commission funded higher education partnership for international development cooperation with the objective of building the capacity of individuals who will operate at institutions located in seismic-prone Asian countries. The project involves five European universities, eight Asian universities, and four associations and NGOs active in advanced research on seismic mitigation, disaster risk management, and international development. The project consists of a comprehensive mobility scheme open to nationals from Afghanistan, Bangladesh, China, Nepal, Pakistan, Thailand, Bhutan, India, Indonesia, Malaysia, Maldives, North Korea, Philippines, and Sri Lanka who plan to enrol in school or conduct research at one of the five European partner universities in Italy, Greece and Portugal. During the 2010-14 time span, a total of 104 mobilities are involved in scientific activities at the undergraduate, masters, PhD, postdoctoral and academic-staff exchange levels. These mobilities and activities are selected and designed to produce an overall increase in knowledge that can have an impact on earthquake mitigation. Researchers, future policymakers and practitioners build up their curricula over a range of disciplines in the fields of engineering, seismology, disaster risk management and urban planning. Specific educational and research activities focus on earthquake risk mitigation topics such as anti-seismic structural design, structural engineering, advanced computational structural collapse analysis, seismology, experimental laboratory studies, international and development issues in disaster risk management, socio-economic impact studies, international relations, and conflict resolution.